ABSTRACT


A major problem encountered in planning for Space Station 
Freedom is the amount of maintenance that will be required. 
To predict the failure rates of components and systems aboard 
Space Station Freedom, the logical approach is to use data 
obtained from previously flown spacecraft. In order to 
determine the mechanisms that are driving the failures, 
models can be proposed, and then checked to see if they 
adequately fit the observed failure data obtained from a 
large variety of satellites. For this particular study, 
failure data and truncation times were available for 
satellites launched between 1976 and 1984; no data past 1984 
was available. The study was limited to electrical 
subsystems and assemblies, which were studied to determine if 
they followed a model resulting from a mixture of exponential 
distributions.


INTRODUCTION


In order to accurately estimate and reduce the amount of 
maintenance that will be required on Space Station Freedom,
it is necessary to understand the mechanisms that cause the
failures of its components and systems. Ideally, formal life
tests would be conducted, where a sample of each type of 
component would be put on test under environmental and 
operational conditions identical to those under which it is 
to be used, and the time to failure for each would be 
observed. Due to time constraints, it is not always possible 
to observe all of the items until they fail; in this case 
some type of censoring mechanism is employed. 

There are two basic types of censoring which have been 
extensively studied. Type I censoring occurs when n items 
are placed on test and observed for a fixed period of time t. 
Only the lifetimes of those which fail before time t are 
known; the others are said to be time censored or truncated. 
In this case the length of the test, t, is fixed, but the 
number of failures observed, r, is random. In Type II 
censoring, n items are placed on test and the test is 
terminated after the r-th item fails. In this case, the number
of failures, r, is fixed in advance, but the length of time 
of the test, t, is a random variable. Both of these types of 
censoring have been treated extensively in the literature.
See, for example, Barlow and Proschan (1975), Bain (1978),
and Lawless (1982). For information on additional types of
censoring, see McCool (1982), and Mann and Singpurwalla
(1983).
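
The difference between the two schemes can be illustrated with a short simulation. This is only a sketch: the function names, the sample size of 20, the mean life of 1000 hours, and the stopping rules are all hypothetical values chosen for illustration and are not taken from the satellite data.

```python
import random

def type_i_censor(lifetimes, t_stop):
    """Type I: observe for a fixed time t_stop; the failure count r is random."""
    observed = sorted(x for x in lifetimes if x <= t_stop)
    return observed, len(observed), t_stop

def type_ii_censor(lifetimes, r):
    """Type II: stop at the r-th failure; the test length t is random."""
    observed = sorted(lifetimes)[:r]
    return observed, r, observed[-1]

rng = random.Random(0)
# Hypothetical test: n = 20 items with exponential lifetimes, mean 1000 hours.
lifetimes = [rng.expovariate(1.0 / 1000.0) for _ in range(20)]

fails_i, r_i, t_i = type_i_censor(lifetimes, t_stop=500.0)
fails_ii, r_ii, t_ii = type_ii_censor(lifetimes, r=5)

print(f"Type I : t fixed at {t_i}, observed r = {r_i}")
print(f"Type II: r fixed at {r_ii}, test ended at t = {t_ii:.1f}")
```

In both cases the lifetimes of the remaining items are known only to exceed the time at which the test ended.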

Unfortunately, the situation encountered in building 
spacecraft is far from ideal. It is impossible to conduct 
meaningful life tests on earth because it is not possible to 
reproduce the microgravity environment. Thus to obtain data 
on failure of components in microgravity, it is necessary to 
turn to field data obtained from previously flown spacecraft. 
The situation is further complicated by the fact that the 
censoring taking place in this type of situation is neither 
Type I nor Type II. It is not determined in advance how many 
components will fail. Furthermore, the truncation times are 
not known in advance. The lifetime of one component may be
truncated by the failure of another. For example, if the
attitude control malfunctions, the satellite may fall out of
orbit while all of its other components are still
functioning, so none of their lifetimes can be observed.
Also, many components may still be operating at the last time
they are observed.

The case where the number of failures and the truncation
times are both random variables has not been studied nearly
as extensively. In fact, there is no consensus on what it is
called. Mann and Singpurwalla (1983) and Lawless (1982) refer
to this general case as random censoring, while the entry
under "Random Censoring" in the Encyclopedia of Statistical
Sciences, Vol. 1, defines random censoring as a completely
different situation. In any case, the particular models
given by Mann and Singpurwalla (1983) and Lawless (1982) are
not applicable to the satellite data because they assume that
failure times are identically distributed.

The approach taken in this study to determine the 
mechanisms that produce the failures is to propose a 
mechanism, and then see how well the resulting model fits 
what is actually observed in the data. If the theoretical 
survival function differs significantly from the empirical 
survival function, then the proposed mechanism is not what is 
actually producing the failures. 


THE SATELLITE DATA 

The failure data for over 300 satellites was compiled by 
Planning Research Corporation and originally published in 
Bloomquist et al. (1978), with an update in 1984. This
data is currently being compiled into a data base by Loral 
Space Information Systems. The data base includes each 
component of each satellite, classified by subsystem and 
assembly. The times of all failures are included, as are the
truncation times for components that did not fail.

The way the times were recorded was not consistent for 
all satellites. Often assemblies are turned off and on 
during the life of the satellite, so that they are not 
operating for the entire life of the satellite. Some 
elements, such as backup systems which are never needed, 
never get turned on at all. For some of the satellites, the 
times given were actual operating times for that assembly; 
for others, they were merely the time since launch, or 
"survival time." For the purposes of this study, operating 
times and survival times were treated separately. 

Since the data base was still being compiled at the time 
of this study, it was necessary to limit the number of 
satellites used in order to obtain some data. Since the more 
recent satellites are more likely to utilize the same type of 
technology as Space Station Freedom, only satellites launched 
since 1976 were considered. This was a total of 28 
satellites. Of these, only four had recorded operating 
times; the rest had recorded survival times. The data was 
further limited by considering only electronic assemblies, 
because the model considered was proposed for failure of 
electronic parts. 


THE PROPOSED MODEL


It is generally accepted that the lifetime of electronic 
components has an exponential distribution, that is 

\[ f(t \mid \lambda) = \lambda e^{-\lambda t}, \quad t \ge 0, \]


where λ is a parameter determined by the failure rate. The
exponential distribution possesses a unique property: it has
a constant hazard rate. However, when considering many
different types of electrical assemblies on different 
satellites, the odds are that the failure rates will not be 
the same for all of them. Thus the failure times do not 
represent a sample from a single exponential distribution, 
but rather from a mixture of exponentials with varying 
parameters. While each individual hazard rate is constant, 
it turns out that the hazard rate of the mixture is actually 
decreasing (see Barlow and Proschan (1975), p. 102). 
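
The decreasing hazard rate of a mixture can be checked numerically. The sketch below uses a hypothetical two-component mixture; the rates and mixing proportions are illustrative assumptions, not values from the satellite data. The hazard is computed as the mixture density divided by the mixture survival function.

```python
import math

def mixture_hazard(t, rates, weights):
    """Hazard h(t) = f(t) / S(t) of a finite mixture of exponentials."""
    density = sum(w * lam * math.exp(-lam * t) for lam, w in zip(rates, weights))
    survival = sum(w * math.exp(-lam * t) for lam, w in zip(rates, weights))
    return density / survival

rates = [0.001, 0.01]   # hypothetical failure rates (per hour)
weights = [0.5, 0.5]    # hypothetical mixing proportions
times = [0.0, 100.0, 500.0, 1000.0]
hazards = [mixture_hazard(t, rates, weights) for t in times]

for t, h in zip(times, hazards):
    print(f"t = {t:7.1f}   h(t) = {h:.6f}")
```

The hazard starts at the weighted average of the two rates and falls toward the smaller rate, because the high-rate units tend to fail first and the surviving population is increasingly dominated by low-rate units.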

Hecht and Hecht (1985) analyzed the original PRC data
and concluded that the failures did indeed possess a
decreasing hazard rate. Heydorn et al. (1991) proposed a
model derived from a mixture of ordinary exponentials and
demonstrated that its predictions were close to the results
given in Hecht and Hecht (1985). However, they did not have
the original data, so their results depended on those given
in the Hecht and Hecht report. This study uses the model
proposed by Heydorn et al. and fits it to the actual
satellite data.

The model is obtained by assuming that the lifetimes of the
electronic components are exponentially distributed with
parameter λ. Thus for a fixed λ the failures are generated
by a Poisson process with parameter λ. However, λ can be
considered a random variable since the failure rates are not
the same for all components. Taking the Bayesian approach
and assuming that the prior distribution of λ is uniform on
(0, ∞), the posterior distribution of λ is a gamma
distribution, and the resulting reliability function is


\[ R(t) = \frac{1}{\left(1 + t/T\right)^{a+1}} \tag{1} \]


where a and T are parameters. The purpose of this study is 
to find the values of the parameters a and T which best fit 
the empirical survival function. 
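
The step from a gamma-distributed λ to Equation (1) can be sketched as follows. The sketch assumes that the posterior density of λ is gamma with shape a + 1 and rate T; this parameterization is not stated in the text and is chosen here because it reproduces (1). Under that assumption, averaging the exponential reliability over the posterior gives

\[
R(t) = \int_0^\infty e^{-\lambda t} \, \frac{T^{a+1}}{\Gamma(a+1)} \, \lambda^{a} e^{-\lambda T} \, d\lambda
     = \frac{T^{a+1}}{\Gamma(a+1)} \cdot \frac{\Gamma(a+1)}{(T+t)^{a+1}}
     = \left( \frac{T}{T+t} \right)^{a+1}
     = \frac{1}{\left(1 + t/T\right)^{a+1}} .
\]

With a uniform prior and Poisson-distributed failures, a posterior of this form would arise from observing a failures in a total exposure time T, which suggests one possible reading of the two parameters.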


EMPIRICAL SURVIVAL FUNCTION


The next step is to estimate the empirical survival 
function. If all of the failure times are known, then the 
estimated survival function is 

\[ \hat{S}(t) = \frac{\text{Number of observations} \ge t}{n}, \quad t \ge 0, \]

where n is the total number of observations.


If, however, some of the survival times are unknown, then 
this must be modified. Kaplan and Meier (1958) introduced an 
estimate called the product-limit (PL) estimate which 
handles the case of random censoring. The estimate is 

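\[ S_{PL}(t) = \prod_{j=1}^{k} \left( 1 - \frac{\phi_j}{N - j + 1} \right), \]

where k is the total number of events (including both
failures and truncations) up to time t, N is the total number
of events in the test, and φ_j is an indicator variable defined
by

\[ \phi_j = \begin{cases} 0 & \text{if event } j \text{ is a truncation} \\ 1 & \text{if event } j \text{ is a failure.} \end{cases} \]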

The estimate is a product, where each term can be thought of
as the conditional probability of surviving past time t_j,
given survival to just prior to t_j, where t_j is the time of
event j. It is a step function, which steps down at each
failure time t_j. For more information on the properties of
the PL estimate, see Lawless (1982) and Peterson (1983).
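
A minimal sketch of the PL computation is given below. The function name and the five-event data set are hypothetical and serve only to show the mechanics: events are ranked by time, a failure at rank j multiplies the running estimate by 1 - 1/(N - j + 1), and a truncation leaves it unchanged. At tied times the sketch ranks failures before truncations, which is the usual convention and an assumption here.

```python
def product_limit(events):
    """Product-limit estimate from (time, is_failure) pairs.

    Returns a list of (time, estimate) pairs, one per failure time.
    """
    # Rank events by time; at ties, failures come before truncations.
    ordered = sorted(events, key=lambda e: (e[0], not e[1]))
    n_total = len(ordered)
    survival = 1.0
    steps = []
    for j, (time, is_failure) in enumerate(ordered, start=1):
        if is_failure:
            # n_total - j + 1 items are still at risk just before event j.
            survival *= 1.0 - 1.0 / (n_total - j + 1)
            steps.append((time, survival))
    return steps

# Hypothetical data: (time, True for a failure, False for a truncation).
events = [(5.0, True), (8.0, False), (12.0, True), (20.0, False), (30.0, False)]
steps = product_limit(events)

for time, s in steps:
    print(f"t = {time:5.1f}   S_PL(t) = {s:.4f}")
```

In this small example the two largest times are truncations, so the estimate never drops below its value at the last failure.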

The problem encountered with using the PL estimate of 
the reliability function is that when many of the largest 
times are truncations, none of the terms in the product 
approach zero, and hence the estimate yields only a small 
portion of the reliability function. For example, in the 
data set for satellite operating times, there were a total of 
165 observations, of which 14 were failures; the rest were 
truncations. Furthermore, the 82 largest times were 
truncations. This meant that the smallest term in the product
was 82/83. The PL estimate of the survival function for the
operating time data is: 

        t      S_PL(t)
       50      0.9939
       70      0.9878
       76      0.9817
      264      0.9756
     1440      0.9633
     3890      0.9572
     4300      0.9447
    10500      0.9367
    14980      0.9285
    20747      0.9203
    29361      0.9120
    34526      0.9009


The data set for satellite survival times contained 907 
points, of which the last 154 were truncations. The PL 
estimate of the survival function for this data ranged from 
0.9989 to 0.8695. 


ESTIMATING THE PARAMETERS 


To find the estimates of a and T in Equation (1) which 
give the best fit to the PL estimate of the survival 
function, the first approach was to use the least squares 
criterion. Since (1) is a nonlinear function, the nonlinear 
least squares routine in S-PLUS was used. Unfortunately, the
Jacobian matrix was nearly rank deficient for a wide range of
initial values, including the estimates obtained from the
alternative approach described below. Thus another approach
had to be used to estimate the parameters.

While the survival function given in (1) is nonlinear, 
it turns out that the inverse of the hazard function 
associated with it is a linear function. For a survival 
function S(t), the hazard function is 


\[ h(t) = -\frac{d \ln S(t)}{dt}. \]


The hazard function associated with the survival function 
given in (1) is 


\[ h(t) = \frac{a+1}{T+t}, \]

and its inverse can be expressed as the linear function

\[ \frac{1}{h(t)} = \frac{T}{a+1} + \frac{t}{a+1}. \tag{2} \]

Thus the parameters a and T can be estimated by finding the 
empirical hazard function, taking its inverse, and using 
least squares to estimate the parameters of the resulting 
linear equation. These coefficients can then be used to
solve for a and T in (2).
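
The last step can be written out explicitly. Writing the fitted line as b_0 + b_1 t, where the symbols b_0 and b_1 are introduced here only for illustration, and matching coefficients with (2) gives

\[
b_1 = \frac{1}{a+1}, \qquad b_0 = \frac{T}{a+1}
\quad \Longrightarrow \quad
a = \frac{1}{b_1} - 1, \qquad T = \frac{b_0}{b_1} .
\]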

The empirical hazard function was estimated by taking 
the log of the PL estimate of the survival function, then 
taking successive differences in the resulting values and 
dividing by the width of the corresponding time interval to 
approximate the derivative. A linear model was then fit to 
the inverse of the empirical hazard function, and the
parameters a and T were determined.
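
The procedure can be sketched in a few lines. This is not the code used in the study: the function name is hypothetical, the hazard on each interval is assigned to the interval midpoint (an assumption, since the text does not say where it was placed), and the input is synthetic data generated from Equation (1) with hypothetical values a = 1 and T = 1000 rather than the satellite data.

```python
import math

def fit_inverse_hazard(steps):
    """Estimate (a, T) from [(t, S(t)), ...] using 1/h(t) = T/(a+1) + t/(a+1)."""
    xs, ys = [], []
    for (t0, s0), (t1, s1) in zip(steps, steps[1:]):
        # Finite-difference approximation of h(t) = -d ln S(t) / dt.
        hazard = -(math.log(s1) - math.log(s0)) / (t1 - t0)
        xs.append(0.5 * (t0 + t1))   # assumption: use the interval midpoint
        ys.append(1.0 / hazard)
    # Ordinary least squares for the line y = intercept + slope * x.
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Match coefficients: slope = 1/(a+1), intercept = T/(a+1).
    return 1.0 / slope - 1.0, intercept / slope

# Synthetic survival values from Equation (1) with hypothetical parameters.
a_true, T_true = 1.0, 1000.0
steps = [(float(t), (1.0 + t / T_true) ** -(a_true + 1.0)) for t in range(0, 1100, 100)]

a_hat, T_hat = fit_inverse_hazard(steps)
print(f"a = {a_hat:.3f}, T = {T_hat:.1f}")
```

On noise-free input the fit returns values close to the generating parameters; with a real PL estimate the successive differences are far noisier, so the fitted line is correspondingly less stable.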


RESULTS 

To begin the analysis, recall that if the mixed 
exponential model is correct, the failures should have a 
decreasing hazard function. Plots of the empirical hazard 
function for the operating time data and survival time data 
are given in Figures 1 and 2, respectively. Both plots are 
extremely noisy, and neither one definitely indicates a 
decreasing hazard function. No conclusions can be drawn from 
these plots. 

Plots of the empirical survival function and estimated 
theoretical survival function for the operating time and 
survival time data are given in Figures 3 and 4, 
respectively. It can be seen that neither of these 
demonstrates a very good fit. In particular, the empirical 
survival function initially drops at a much faster rate than 
the theoretical one for both sets of data. The data appears 
to display early failures, or "infant mortality" that is not 
adequately explained by the model. Based on these plots, the 
mixed exponential model does not seem to adequately explain 
the failures. A different model must be sought. 

In conclusion, this research indicates that a mixture of 
exponential distributions does not adequately explain 
electronic failures seen in previously flown satellites. In 
particular, it does not model the early failures very well.
A different model will be needed to explain the mechanism
generating the failures.


Figure 1.- Empirical hazard function of operating time data

Figure 2.- Empirical hazard function of survival time data

Figure 3.- Estimated survival functions of operating time data

Figure 4.- Estimated survival functions of survival time data




REFERENCES 


Bain, L. J. (1978), Statistical Analysis of Reliability and
Life Testing: Theory and Methods, New York: Marcel Dekker.

Barlow, R. E., and Proschan, F. (1975), Statistical Theory of
Reliability and Life Testing: Probability Models, New York:
Holt, Rinehart, and Winston.

Bloomquist, C., Anderson, V., Demars, D., Graham, W.,
Henruri, P., and Stiehl, G. (1978), "On-Orbit Spacecraft
Reliability," PRC Report R-1863, September 30, 1978.

Hecht, J., and Hecht, M. (1985), "Reliability Prediction For
Spacecraft," RADC-TR-85-229, December 1985.

Heydorn, R., Blumentritt, W., Doran, L., and Graber, R.
(1991), "A Model for Projecting the Number of Early
Failures on Space Station Freedom," NASA/JSC preliminary
report.

Kaplan, E. L., and Meier, P. (1958), "Nonparametric
Estimation from Incomplete Observations," Journal of the
American Statistical Association, 53, 457-481.

Lawless, J. F. (1982), Statistical Models and Methods for
Lifetime Data, New York: John Wiley & Sons.

McCool, J. I. (1982), "Censored Data," in Encyclopedia of
Statistical Sciences Vol. 1, eds. S. Kotz and N. L. Johnson,
New York: John Wiley & Sons, pp. 389-396.

Mann, N. R., and Singpurwalla, N. D. (1983), "Life Testing,"
in Encyclopedia of Statistical Sciences Vol. 4, eds. S. Kotz
and N. L. Johnson, New York: John Wiley & Sons, pp. 632-639.

Peterson, A. V. (1983), "Kaplan-Meier Estimator," in
Encyclopedia of Statistical Sciences Vol. 4, eds. S. Kotz
and N. L. Johnson, New York: John Wiley & Sons, pp. 346-351.





